Results 1 - 12 of 12
1.
medRxiv ; 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38633810

ABSTRACT

Background: Early detection of cognitive decline in elderly individuals facilitates clinical trial enrollment and timely medical interventions. This study aims to apply, evaluate, and compare advanced natural language processing techniques for identifying signs of cognitive decline in clinical notes. Methods: This study, conducted at Mass General Brigham (MGB), Boston, MA, included clinical notes from the 4 years prior to initial mild cognitive impairment (MCI) diagnosis in 2019 for patients ≥ 50 years. Note sections regarding cognitive decline were labeled manually. A random sample of 4,949 note sections filtered with cognitive function-related keywords was used for traditional AI model development, and a random subset of 200 was used for LLM and prompt development; another random sample of 1,996 note sections without keyword filtering was used for testing. Prompt templates for large language models (LLMs), Llama 2 on Amazon Web Services and GPT-4 on Microsoft Azure, were developed with multiple prompting approaches to select the optimal LLM-based method. Baseline comparisons were made with XGBoost and a hierarchical attention-based deep neural network model. An ensemble of the three models was then constructed using majority vote. Results: GPT-4 demonstrated superior accuracy and efficiency to Llama 2. The ensemble model outperformed individual models, achieving a precision of 90.3%, recall of 94.2%, and F1-score of 92.2%. Notably, the ensemble model demonstrated a marked improvement in precision (from a 70%-79% range to above 90%) compared to the best-performing single model. Error analysis revealed that 63 samples were wrongly predicted by at least one model; however, only 2 cases (3.2%) were mutual errors across all models, indicating diverse error profiles among them. Conclusion: Our findings indicate that LLMs and traditional models exhibit diverse error profiles. The ensemble of LLMs and locally trained machine learning models on EHR data was found to be complementary, enhancing performance and improving diagnostic accuracy.
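
The majority-vote step and the reported metrics can be illustrated with a minimal sketch. It assumes binary labels (1 = section mentions cognitive decline) and uses toy predictions; the names `model_a`, `model_b`, and `model_c` are hypothetical stand-ins and are not outputs of the study's models.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-sample labels from several models by majority vote."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*model_predictions)]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): each model makes two errors, on different samples.
y_true = [1, 0, 1, 1, 0, 0]
model_a = [1, 1, 1, 0, 0, 0]
model_b = [1, 0, 0, 1, 0, 1]
model_c = [0, 0, 1, 1, 1, 0]

ensemble = majority_vote(model_a, model_b, model_c)
print(precision_recall_f1(y_true, model_a))
print(precision_recall_f1(y_true, ensemble))

# Consistency check on the reported figures: P = 90.3%, R = 94.2%.
print(round(f1_from(0.903, 0.942), 3))
```

With three voters and binary labels a tie cannot occur. Because the toy models err on different samples, the vote corrects every individual error, which is the kind of complementarity the abstract attributes to diverse error profiles.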

2.
Sci Rep ; 14(1): 7831, 2024 04 03.
Article in English | MEDLINE | ID: mdl-38570569

ABSTRACT

The objective of this study is to develop and evaluate natural language processing (NLP) and machine learning models to predict infant feeding status from clinical notes in the Epic electronic health records system. The primary outcome was the classification of infant feeding status from clinical notes using Medical Subject Headings (MeSH) terms. Annotation of notes was completed using TeamTat to uniquely classify clinical notes according to infant feeding status. We trained 6 machine learning models to classify infant feeding status: logistic regression, random forest, XGBoost (gradient boosting), k-nearest neighbors, and support-vector classifier. Models were compared on overall accuracy, precision, recall, and F1 score. Our modeling corpus was a balanced sample with an equal number of clinical notes in each class. We manually reviewed 999 notes that represented 746 mother-infant dyads with a mean gestational age of 38.9 weeks and a mean maternal age of 26.6 years. The most frequent feeding status classification in this study was exclusive breastfeeding [n = 183 (18.3%)], followed by exclusive formula bottle feeding [n = 146 (14.6%)], and exclusive feeding of expressed mother's milk [n = 102 (10.2%)], with mixed feeding being the least frequent [n = 23 (2.3%)]. Our final analysis evaluated the classification of clinical notes as breast, formula/bottle, and missing. The machine learning models were trained on these three classes after balancing and downsampling. The XGBoost model outperformed all others, achieving an accuracy of 90.1%, a macro-averaged precision of 90.3%, a macro-averaged recall of 90.1%, and a macro-averaged F1 score of 90.1%. Our results demonstrate that natural language processing can be applied to clinical notes stored in electronic health records to classify infant feeding status. Early identification of breastfeeding status using NLP on unstructured electronic health records data can be used to inform precision public health interventions focused on improving lactation support for postpartum patients.
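
As a minimal sketch of how macro-averaged metrics are computed for the three classes named above, the following uses toy labels. The data are hypothetical and the function `macro_metrics` is an illustrative helper, not the study's code.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus unweighted (macro) averages of per-class precision, recall, F1."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy balanced sample (hypothetical): two notes per class, one misclassified.
y_true = ["breast", "breast", "formula", "formula", "missing", "missing"]
y_pred = ["breast", "formula", "formula", "formula", "missing", "missing"]

metrics = macro_metrics(y_true, y_pred)
print(metrics)
```

On a class-balanced evaluation set, macro-averaged recall equals overall accuracy, which is consistent with the abstract reporting 90.1% for both.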


Subjects
Machine Learning; Natural Language Processing; Female; Humans; Infant; Software; Electronic Health Records; Mothers
3.
J Am Soc Mass Spectrom ; 34(12): 2857-2863, 2023 Dec 06.
Article in English | MEDLINE | ID: mdl-37874901

ABSTRACT

Liquid chromatography-mass spectrometry (LC-MS) metabolomics studies produce high-dimensional data that must be processed by a complex network of informatics tools to generate analysis-ready data sets. As the first computational step in metabolomics, data processing increasingly challenges researchers to develop customized computational workflows that are applicable to LC-MS metabolomics analysis. Ontology-based automated workflow composition (AWC) systems provide a feasible approach for developing computational workflows that consume high-dimensional molecular data. We used the Automated Pipeline Explorer (APE) to create an AWC for LC-MS metabolomics data processing across three use cases. Our results show that APE predicted 145 data processing workflows across all three use cases. We identified six traditional workflows and six novel workflows. Through manual review, we found that one-third of novel workflows were executable, meaning the data processing function could be completed without an error. When selecting the top six workflows from each use case, the computationally viable rate of our predicted workflows reached 45%. Collectively, our study demonstrates the feasibility of developing an AWC system for LC-MS metabolomics data processing.
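
The general idea of automated workflow composition, chaining tools whose output type matches the next tool's input type, can be sketched as follows. This is a deliberately simplified breadth-first search, not APE's actual algorithm or API; the tool names and data types in `TOOLS` are hypothetical.

```python
from collections import deque

# Hypothetical tool catalog: tool name -> (input data type, output data type).
TOOLS = {
    "convert_raw": ("raw_spectra", "mzml"),
    "pick_peaks": ("mzml", "peak_list"),
    "align_features": ("peak_list", "feature_table"),
    "annotate": ("feature_table", "annotated_table"),
    "quick_features": ("mzml", "feature_table"),
}


def compose_workflows(source, target, max_steps=4):
    """Enumerate tool chains that turn `source` data into `target` data."""
    workflows = []
    queue = deque([(source, [])])
    while queue:
        data_type, steps = queue.popleft()
        if data_type == target and steps:
            workflows.append(steps)  # a complete candidate workflow
            continue
        if len(steps) == max_steps:
            continue  # bound the search depth
        for name, (tool_in, tool_out) in TOOLS.items():
            # A tool can follow only if it accepts the current data type.
            if tool_in == data_type and name not in steps:
                queue.append((tool_out, steps + [name]))
    return workflows


candidates = compose_workflows("raw_spectra", "annotated_table")
for workflow in candidates:
    print(" -> ".join(workflow))
```

Type matching alone yields candidate workflows only; as the abstract notes, whether a candidate actually runs without error still has to be checked separately.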


Subjects
Hominidae; Software; Animals; Workflow; Metabolomics/methods; Mass Spectrometry; Chromatography, Liquid/methods
4.
Nutrients ; 15(17)2023 Aug 29.
Article in English | MEDLINE | ID: mdl-37686800

ABSTRACT

Epidemiological data demonstrate that bovine whole milk is often substituted for human milk during the first 12 months of life and may be associated with adverse infant outcomes. The objective of this study is to interrogate the human and bovine milk metabolome at 2 weeks of life to identify unique metabolites that may impact infant health outcomes. Human milk (n = 10) was collected at 2 weeks postpartum from normal-weight mothers (pre-pregnancy BMI < 25 kg/m²) who vaginally delivered term infants and were exclusively breastfeeding their infant for at least 2 months. Similarly, bovine milk (n = 10) was collected 2 weeks postpartum from normal-weight primiparous Holstein dairy cows. Untargeted data were acquired on all milk samples using liquid chromatography-high-resolution tandem mass spectrometry (HR LC-MS/MS). MS data pre-processing from feature calling to metabolite annotation was performed using MS-DIAL and MS-FLO. Our results revealed that more than 80% of the milk metabolome is shared between human and bovine milk samples during early lactation. Unbiased analysis of identified metabolites revealed that nearly 80% of milk metabolites may contribute to microbial metabolism and microbe-host interactions. Collectively, these results highlight untargeted metabolomics as a potential strategy to identify unique and shared metabolites in bovine and human milk that may relate to and impact infant health outcomes.


Subjects
Breast Feeding; Tandem Mass Spectrometry; Animals; Female; Infant; Pregnancy; Humans; Cattle; Chromatography, Liquid; Lactation; Milk, Human; Metabolomics
5.
Metabolomics ; 19(2): 11, 2023 02 06.
Article in English | MEDLINE | ID: mdl-36745241

ABSTRACT

BACKGROUND: Liquid chromatography-high resolution mass spectrometry (LC-HRMS) is a popular approach for metabolomics data acquisition and requires many data processing software tools. The FAIR Principles - Findability, Accessibility, Interoperability, and Reusability - were proposed to promote open science and reusable data management, and to maximize the benefit obtained from contemporary and formal scholarly digital publishing. More recently, the FAIR principles were extended to include Research Software (FAIR4RS). AIM OF REVIEW: This study facilitates open science in metabolomics by providing an implementation solution for adopting FAIR4RS in LC-HRMS metabolomics data processing software. We believe our evaluation guidelines and results can help improve the FAIRness of research software. KEY SCIENTIFIC CONCEPTS OF REVIEW: We evaluated 124 LC-HRMS metabolomics data processing software tools obtained from a systematic review and selected 61 for detailed evaluation using FAIR4RS-related criteria, which were extracted from the literature along with internal discussions. We assigned each criterion one or more FAIR4RS categories through discussion. The minimum, median, and maximum percentages of criteria fulfilled by the software were 21.6%, 47.7%, and 71.8%. Statistical analysis revealed no significant improvement in FAIRness over time. We identified four criteria that cover multiple FAIR4RS categories but had low fulfillment rates: (1) no software had semantic annotation of key information; (2) only 6.3% of evaluated software were registered to Zenodo and received DOIs; (3) only 14.5% of selected software had official software containerization or a virtual machine; (4) only 16.7% of evaluated software had fully documented functions in code. Based on these results, we discussed improvement strategies and future directions.


Subjects
Metabolomics; Software; Metabolomics/methods; Chromatography, Liquid/methods; Mass Spectrometry/methods; Data Management
6.
Metabolites ; 12(1)2022 Jan 17.
Article in English | MEDLINE | ID: mdl-35050209

ABSTRACT

Clinical metabolomics has emerged as a novel approach for biomarker discovery with the translational potential to guide next-generation therapeutics and precision health interventions. However, reproducibility in clinical research employing metabolomics data is challenging. Checklists are a helpful tool for promoting reproducible research. Existing checklists that promote reproducible metabolomics research focus primarily on metadata and may not be sufficient to ensure reproducible metabolomics data processing. This paper provides a checklist of actions that researchers need to take to make computational steps reproducible in clinical metabolomics studies. We developed an eight-item checklist that includes criteria related to reusable data sharing and reproducible computational workflow development. We also provided recommended tools and resources to complete each item, as well as a GitHub project template to guide the process. The checklist is concise and easy to follow. Studies that follow this checklist and use the recommended resources may help other researchers reproduce metabolomics results easily and efficiently.

7.
JMIR Pediatr Parent ; 4(1): e23842, 2021 Mar 05.
Article in English | MEDLINE | ID: mdl-33666558

ABSTRACT

BACKGROUND: Electronic health records (EHRs) hold great potential for longitudinal mother-baby studies, ranging from assessing study feasibility to facilitating patient recruitment to streamlining study visits and data collection. Existing studies on the perspectives of pregnant and breastfeeding women on EHR use have been limited to the use of EHRs to engage in health care rather than to participate in research. OBJECTIVE: The aim of this study is to explore the perspectives of pregnant and breastfeeding women on releasing their own and their infants' EHR data for longitudinal research to identify factors affecting their willingness to participate in research. METHODS: We conducted semistructured interviews with pregnant or breastfeeding women from Alachua County, Florida. Participants were asked about their familiarity with EHRs and EHR patient portals, their comfort with releasing maternal and infant EHR data to researchers, the length of time of the data release, and whether individual research test results should be included in the EHR. The interviews were transcribed verbatim. Transcripts were organized and coded using the NVivo 12 software (QSR International), and coded data were thematically analyzed using constant comparison. RESULTS: Participants included 29 pregnant or breastfeeding women aged between 22 and 39 years. More than half of the sample had an associate degree or higher. Nearly all participants (27/29, 93%) were familiar with EHRs and had experience accessing an EHR patient portal. Less than half of the participants (12/29, 41%) were willing to make EHR data available to researchers for the duration of a study or longer. Participants' concerns about sharing EHRs for research purposes emerged in 3 thematic domains: privacy and confidentiality, transparency by the research team, and surrogate decision-making on behalf of infants. The potential release of sensitive or stigmatizing information, such as mental or sexual health history, was considered in the decisions to release EHRs. Some participants viewed the simultaneous use of their EHRs for both health care and research as potentially beneficial, whereas others expressed concerns about mixing their health care with research. CONCLUSIONS: This exploratory study indicates that pregnant and breastfeeding women may be willing to release EHR data to researchers if researchers adequately address their concerns regarding the study design, communication, and data management. Pregnant and breastfeeding women should be included in EHR-based research as long as researchers are prepared to address their concerns.

8.
BMC Pregnancy Childbirth ; 21(1): 67, 2021 Jan 20.
Article in English | MEDLINE | ID: mdl-33472584

ABSTRACT

BACKGROUND: Investigation of the microbiome during early life has stimulated an increasing number of cohort studies in pregnant and breastfeeding women that require non-invasive biospecimen collection. The objective of this study was to explore pregnant and breastfeeding women's perspectives on longitudinal clinical studies that require non-invasive biospecimen collection and how they relate to study logistics and research participation. METHODS: We completed in-depth semi-structured interviews with 40 women who were either pregnant (n = 20) or breastfeeding (n = 20) to identify their understanding of longitudinal clinical research, the motivations and barriers to their participation in such research, and their preferences for providing non-invasive biospecimen samples. RESULTS: Perspectives on research participation were focused on breastfeeding and perinatal education. Participants cited direct benefits of research participation that included flexible childcare, lactation support, and incentives and compensation. Healthcare providers, physician offices, and social media were cited as credible sources and channels for recruitment. Participants viewed lengthy study visits and child protection as the primary barriers to research participation. The barriers to biospecimen collection were centered on stool sampling, inadequate instructions, and drop-off convenience. CONCLUSION: Women in this study were interested in participating in clinical studies that require non-invasive biospecimen collection, and motivations to participate center on breastfeeding and the potential to make a scientific contribution that helps others. Effectively recruiting pregnant or breastfeeding participants for longitudinal microbiome studies requires protocols that account for participant interests and consideration for their time.


Subjects
Breast Feeding/psychology; Health Knowledge, Attitudes, Practice; Pregnant Women/psychology; Research Subjects/psychology; Specimen Handling/psychology; Adolescent; Adult; Female; Florida; Humans; Interviews as Topic; Longitudinal Studies; Middle Aged; Motivation; Pregnancy; Young Adult
9.
Sci Total Environ ; 764: 143963, 2021 Apr 10.
Article in English | MEDLINE | ID: mdl-33385644

ABSTRACT

Consumption of licit and/or illicit compounds during sporting events has traditionally been monitored using population surveys, medical records, and law enforcement seizure data. This pilot study evaluated the temporal and geospatial patterns in drug consumption during a university football game from wastewater using liquid chromatography-tandem mass spectrometry (LC-MS/MS). Untreated wastewater samples were collected from three locations within or near the same football stadium every 30 min during a university football game. This analysis leveraged two LC-MS/MS instruments (a Waters Acquity TQD and a Shimadzu 8040) to analyze samples for 58 licit or illicit compounds and some of their metabolites. Bayesian multilevel models were implemented to estimate mass load and population-level drug consumption, while accounting for multiple instrument runs and concentrations censored at the lower limit of quantitation. Overall, 29 compounds were detected in at least one wastewater sample collected during the game. The 10 most common compounds included opioids, anorectics, stimulants, and decongestants. For compounds detected in more than 50% of samples, temporal trends in median mass load were correlated with the timing of the game; peak loads for cocaine and tramadol occurred during the first quarter of the game and for phentermine during the third quarter. Stadium-wide estimates of the number of doses of drugs consumed were rank ordered as follows: oxycodone (n = 3246) > hydrocodone (n = 2260) > phentermine (n = 513) > cocaine (n = 415) > amphetamine (n = 372) > tramadol (n = 360) > pseudoephedrine (n = 324). This analysis represents the most comprehensive assessment of drug consumption during a university football game and indicates that wastewater-based epidemiology has potential to inform public health interventions focused on reducing recreational drug consumption during large-scale sporting events.


Subjects
Wastewater; Water Pollutants, Chemical; Bayes Theorem; Chromatography, Liquid; Humans; Pilot Projects; Substance Abuse Detection; Tandem Mass Spectrometry; Universities; Wastewater/analysis; Water Pollutants, Chemical/analysis
10.
J Pediatr Surg ; 56(10): 1703-1710, 2021 Oct.
Article in English | MEDLINE | ID: mdl-33342603

ABSTRACT

PURPOSE: Necrotizing enterocolitis (NEC) and spontaneous intestinal perforation (SIP) are devastating diseases in preterm neonates, often requiring surgical treatment. Previous studies evaluated outcomes in peritoneal drain placement versus laparotomy, but the accuracy of the presumptive diagnosis remains unknown without bowel visualization. Predictive analytics provide the opportunity to determine the etiology of perforation and guide surgical decision making. The purpose of this investigation was to build and evaluate machine learning models to differentiate NEC and SIP. METHODS: Neonates who underwent drain placement or laparotomy for NEC or SIP were identified and grouped definitively via bowel visualization. Patient characteristics were analyzed using machine learning methodologies, which were optimized through areas under the receiver operating characteristic curve (AUROC). The model was further evaluated using a validation cohort. RESULTS: Forty patients were identified. A random forest model achieved 98% AUROC while a ridge logistic regression model reached 92% AUROC in differentiating diseases. When applying the trained random forest model to the validation cohort, outcomes were correctly predicted. CONCLUSIONS: This study supports the feasibility of using a novel machine learning model to differentiate between NEC and SIP prior to any intended surgical interventions. LEVEL OF EVIDENCE: Level II. TYPE OF STUDY: Clinical Research Paper.
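
For readers unfamiliar with the AUROC metric used above, a minimal sketch of its rank-based definition follows: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The label encoding (1 = NEC, 0 = SIP) and the scores are hypothetical, not study data.

```python
def auroc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical): model scores for three NEC and three SIP cases.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

print(round(auroc(y_true, scores), 3))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of this size; larger datasets would typically use a sort-based implementation.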


Subjects
Enterocolitis, Necrotizing; Infant, Premature, Diseases; Intestinal Perforation; Enterocolitis, Necrotizing/diagnosis; Enterocolitis, Necrotizing/surgery; Humans; Infant, Newborn; Infant, Premature, Diseases/surgery; Intestinal Perforation/diagnosis; Intestinal Perforation/etiology; Intestinal Perforation/surgery; Laparotomy; Machine Learning; Retrospective Studies
11.
Int J Med Inform ; 139: 104140, 2020 07.
Article in English | MEDLINE | ID: mdl-32325370

ABSTRACT

BACKGROUND: Febrile neutropenia (FN) has been associated with high mortality among adults with cancer. Current systems for early detection of inpatient FN mortality are based on scoring indexes that require intensive subjective evaluation by physicians. OBJECTIVE: In this study, we leveraged machine learning techniques to build an FN mortality risk evaluation tool focused on FN admissions without physicians' subjective evaluation. METHODS: We used the National Inpatient Sample and Nationwide Inpatient Sample (NIS), which included mortality data among adult inpatients who were diagnosed with FN during a hospital admission. Machine learning techniques that we compared included linear models (ridge logistic regression and linear support vector machine) and non-linear models (gradient boosting tree and neural network). The primary outcome for this study was death among individuals with a recorded FN admission. Models were compared based on areas under the receiver operating characteristic curve (AUROC), and model performance was estimated using a 30% test set created via stratified split. RESULTS: Our analysis detected 126,013 adult admissions within the NIS data that were diagnosed with FN, among which 5,856 were declared as deceased (4.6%). Our machine learning results demonstrate that linear models and non-linear models achieved areas under the receiver operating characteristic curve (AUROC) around 92% in survival prediction. CONCLUSIONS: We developed machine learning models that do not require physicians' subjective evaluation for FN mortality risk prediction.
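
A stratified split keeps the outcome rate roughly equal in the training and test sets, which matters when the outcome is as rare as the 4.6% mortality reported here. The following is a minimal sketch on toy labels, assuming 1 = died and 0 = survived; the function `stratified_split` is an illustrative helper, not the study's code.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split sample indices so each class contributes ~test_fraction to the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # randomize within each class
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy cohort (hypothetical): 1,000 admissions with a 4.6% death rate.
labels = [1] * 46 + [0] * 954
train, test = stratified_split(labels)

test_rate = sum(labels[i] for i in test) / len(test)
print(len(train), len(test), round(test_rate, 3))

# Consistency check on the reported figures: 5,856 deaths among 126,013 admissions.
print(round(5856 / 126013 * 100, 1))
```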


Subjects
Febrile Neutropenia/mortality; Hospital Mortality/trends; Hospitalization/statistics & numerical data; Machine Learning; Neural Networks, Computer; Support Vector Machine; Aged; Female; Humans; Male; Middle Aged; ROC Curve; Survival Rate
12.
J Med Internet Res ; 19(12): e414, 2017 12 13.
Article in English | MEDLINE | ID: mdl-29237586

ABSTRACT

BACKGROUND: Social media is being used by various stakeholders among pharmaceutical companies, government agencies, health care organizations, professionals, and news media as a way of engaging audiences to raise disease awareness and ultimately to improve public health. Nevertheless, it is unclear what effects this health information has on laypeople. OBJECTIVE: This study aimed to provide a detailed examination of how promotional health information related to Lynch syndrome impacts laypeople's discussions on a social media platform (Twitter) in terms of topic awareness and attitudes. METHODS: We used topic modeling and sentiment analysis techniques on Lynch syndrome-related tweets to answer the following research questions (RQs): (1) what are the most discussed topics in Lynch syndrome-related tweets?; (2) how does promotional Lynch syndrome-related information on Twitter affect laypeople's discussions?; and (3) what impact do the Lynch syndrome awareness activities in the Colon Cancer Awareness Month and Lynch Syndrome Awareness Day have on laypeople's discussions and their attitudes? In particular, we used a set of keywords to collect Lynch syndrome-related tweets from October 26, 2016 to August 11, 2017 (289 days) through the Twitter public search application programming interface (API). We experimented with two different classification methods to categorize tweets into the following three classes: (1) irrelevant, (2) promotional health information, and (3) laypeople's discussions. We applied a topic modeling method to discover the themes in these Lynch syndrome-related tweets and conducted sentiment analysis on each layperson's tweet to gauge the writer's attitude (i.e., positive, negative, or neutral) toward Lynch syndrome. The topic modeling and sentiment analysis results were elaborated to answer the three RQs. RESULTS: Of all tweets (N=16,667), 87.38% (14,564/16,667) were related to Lynch syndrome. Of the Lynch syndrome-related tweets, 81.43% (11,860/14,564) were classified as promotional and 18.57% (2704/14,564) were classified as laypeople's discussions. The most discussed themes were treatment (n=4080) and genetic testing (n=3073). We found that the topic distributions in laypeople's discussions were similar to the distributions in promotional Lynch syndrome-related information. Furthermore, most people had a positive attitude when discussing Lynch syndrome. The proportion of negative tweets was 3.51%. Within each topic, treatment (16.67%) and genetic testing (5.60%) had more negative tweets compared with other topics. When comparing monthly trends, laypeople's discussions had a strong correlation with promotional Lynch syndrome-related information on awareness (r=.98, P<.001), while there were moderate correlations on screening (r=.602, P=.05), genetic testing (r=.624, P=.04), treatment (r=.69, P=.02), and risk (r=.66, P=.03). We also discovered that the Colon Cancer Awareness Month (March 2017) and the Lynch Syndrome Awareness Day (March 22, 2017) had significant positive impacts on laypeople's discussions and their attitudes. CONCLUSIONS: There is evidence that participative social media platforms, namely Twitter, offer unique opportunities to inform cancer communication surveillance and to explore the mechanisms by which these new communication media affect individual health behavior and population health.
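
The monthly-trend comparison reports correlation coefficients between promotional and laypeople's tweet volumes per topic. A minimal sketch of the Pearson correlation coefficient follows, assuming that is the statistic behind the reported r values (the abstract does not name it); the monthly counts are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical monthly tweet counts for one topic.
promotional = [10, 20, 30, 40]
laypeople = [1, 3, 2, 5]

print(round(pearson_r(promotional, laypeople), 3))

# Consistency check on the reported class proportions.
print(round(14564 / 16667 * 100, 2), round(11860 / 14564 * 100, 2))
```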


Subjects
Colorectal Neoplasms, Hereditary Nonpolyposis/therapy; Health Promotion/methods; Public Health/methods; Social Media/statistics & numerical data; Colorectal Neoplasms, Hereditary Nonpolyposis/pathology; Humans